Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification

Authors

Kazuki Adachi

Tomoki Toda

Hiromichi Kawanami

Hiroshi Saruwatari

Kiyohiro Shikano

Abstract

Our research goal is to construct a Japanese TTS (Text-to-Speech) system that can output various kinds of prosody. Since such synthetic speech is useful in practice, many TTS systems have implemented global prosodic control processing, but fundamentally they are designed to output speech with standard pitch and speech rate. We discuss a synthesis method for high-quality speech with extreme prosody (very high, low, fast, and slow) from the viewpoint of the speech database. As the speech synthesis method, we employ a unit selection-concatenation method. We also introduce an analysis-synthesis process to give precise target prosody to the output speech. Many studies have reported that speech quality degrades in proportion to the amount of prosody modification by analysis-synthesis or PSOLA. Following these reports, we take an approach that reduces the prosody modification of each speech segment. Nine Japanese speech databases with different prosodic characteristics are prepared. First, we confirm the relationship between speech quality deterioration and prosody modification using synthetic speech in objective and subjective tests. We also investigate the relationship between the deterioration tendency and each speech database. The results indicate that the tendencies depend on the prosodic features of the original speech.

Similar resources

Designing Target Cost Function Based on Prosody of Speech Database

This research aims to construct a high-quality Japanese TTS (Text-to-Speech) system that has high flexibility in treating prosody. Many TTS systems have implemented a prosody control system, but such systems have been fundamentally designed to output speech with a standard pitch and speech rate. In this study, we employ a unit selection-concatenation method and also introduce an analysis-synthesi...

Designing speech database with prosodic variety for expressive TTS system

For the purpose of building a speech synthesis system that can generate high-quality speech with a wide range of prosody and realize fine prosody control, we propose a new speech database construction method. As the speech synthesis method, we select a hybrid system that consists of two parts: a speech unit selection part and a prosody modification part by STRAIGHT (vocoder type high quality analysis-synthesis...

Designing Japanese Speech Database Cov for Hybrid Speech Sy

For the purpose of building a Text-to-Speech (TTS) system that can generate high-quality speech with a wide range of prosody, we constructed a speech database. As the speech synthesizer, we use a hybrid system that consists of a unit selection module and prosody modification by STRAIGHT (a vocoder-type, high-quality analysis-synthesis method). Our viewpoint is to reduce the amount of prosody mod...

Assessment of Non-native Prosody for Spanish as L2 using quantitative scores and perceptual evaluation

In this work we present SAMPLE, a new pronunciation database of Spanish as L2, and first results on the automatic assessment of non-native prosody. Listen-and-repeat and reading tasks are carried out by native and foreign speakers of Spanish. The corpus has been designed to support comparative studies and evaluation of automatic pronunciation error assessment at both the phonetic and prosodic levels. Fo...

Improving speech synthesis of CHATR using a perceptual discontinuity function and constraints of prosodic modification

Concatenative synthesis is widely used in TTS to generate synthetic speech with high quality and relatively natural-sounding prosody. Whatever the type of synthesis unit used (diphone, phoneme, etc.), a large speech database is usually needed to ensure the phonetic and phonemic variation of the units in a rich variety of contexts. In the CHATR synthesis system, unit selection finds the most appr...

Publication date: 2004
